Round Overview
Discuss this match
EqualSubstrings2
Problem Details
This is an implementation problem, we can just try all possible starting and ending indices for the first substring. The second substring needs to have a starting index after the first substring’s final index. The lengths must be equal. From this, we only need to try O(|L|^3) pairs of substrings. Checking if the substrings are equal is O(|L|) for a total of O(|L|^4) complexity.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
int get(string s) {
int cnt = 0;
int n = s.size();
// first string uses indices [a,b) , and second uses indices [c,d)
for (int a = 0; a < n; a++) {
for (int b = a + 1; b <= n; b++) {
for (int c = b; c < n; c++) {
int d = c + b - a;
if (s.substr(a, b - a) == s.substr(c, d - c)) {
cnt++;
}
}
}
}
return cnt;
}
Let’s start with the left-most stack. It will either receive some stacks from its right neighbor or give stacks to it. Since this stack only has one neighbor, it’s one of the two easiest stacks to analyze. Let’s try: The stack currently has a[0] stones and needs to have b[i] stones. We should add variables named has and needs. There are three possibilities:
If (has = needs) then we need to do nothing regarding this stack.
If (has > needs) then we need to move p = “has” - “needs”) to the right neighbor (Index 1). This adds a cost of p to the number of moved stones.
If (has < needs) then we need to move “needs” - “has” stones from the right neighbor. For practical purposes, let’s still use p = “has” - “needs”) (which in this case would be negative). Then the cost is -p. A negative number means we needed to take from the next stack.
After processing the 0-th stack we can try the next stack i = 1. We have p to remember what happened with the previous stack. If p is positive, then we moved p stones from stack i-1 to stack i, so the number of stones in the stack is (“has” = a[i] + p) and we need it to have (“need” = b[i]). If p is negative, then we need to take -p stones from stack i, so we initially have (“has” = a[i]) stones and we need to reach (“need” = b[i] - p) stones (p is negative, so -p is positive). Later we take -p stones and the new number of stones will be b[i] which is what we want. If p = 0, then we have a[i] stones and want b[i] stones. In all 3 cases we have two variables: “has” and “needs” which are the same situation we had earlier, since stone i-1 is already processed, we can just solve the rest ignoring it, so we can use the same logic, find a new p and update the cost. Repeat for all i.
After we process the last stack, what does p mean? It means either the amoung to take from or give to an unexistent stack n. If p is not 0, this means that the case is invalid, the total amount of stones in a[] and b[] are inconsistent.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
int get(vector < int > a, vector < int > b) {
int n = a.size();
int p = 0;
int cost = 0;
for (int i = 0; i < n; i++) {
int has, needs;
if (p >= 0) {
has = a[i] + p;
needs = b[i];
} else {
has = a[i];
needs = b[i] - p;
}
p = has - needs;
cost += abs(p);
}
return (p == 0) ? cost : -1;
}
SubtreesCounting
Problem Details
We need to enumerate all subtrees and add up their sizes. Usually when we are asked to find a sum of sizes as opposed to just counting, it’s a trick to spice up a problem that already has a known solution for just counting. Let’s start there, how do we count all the subtrees in this tree?
The idea is to use a recursive solution. For each node i in the subtree, we should be able to find the number of subtrees that have i as a root. i may have some children. One way to solve this issue is to index the children of i and have a function f(i, t) which give us the number of subtrees rooted at i that consider only the first t children of i.
When t = 0, this is the same as if the node had no children at all. When the node has no children, there is only one subtree rooted at the node. So the result is 1.
When t != 0, we can consider child # t-1, let’s name this child j. We know that when we consider only the first t-1 children, the number of subtrees is f(i, t-1). There are f(j, “total_children”(j)) subtrees that start with j and we can combine each of those subtrees with the f(i, t-1) subtrees that are rooted at i and include the previous children. This would be: f(i, t-1) times f(j, “total_children”(j)), but we are also allowed not to include j in the subtrees, so the result is: f(i, t-1) times (f(j, “total_children”(j))+1)
With this logic we can find the number of subtrees rooted at each node i. The sum of all this results is the number of subtrees.
The problem of finding the sum of all the subtree sizes is similar to the counting problem. There is just an extra detail to consider. Let’s define g(i, t) as the sum of sizes of subtrees rooted at i that consider only the first t children of i:
When t = 0, this is once again the same as if the node had no children. There is only 1 subtree and it has 1 node, the sum is 1.
When t != 0, we combine the results for child # t-1 which will still be named j and the results of g(i,t-1). The question is how. The trick is to also consider the number of subtrees f(). Imagine we had the following values:
a : The number of subtrees rooted at i considering only the first t-1 children: f(i, t-1).
b : The sum of the sizes of the previous subtrees: f(i, t-1).
c : The number of subtrees rooted at j that we must combine: f(j, “total_children”(j))+1.
d : The sum of the sizes of those subtrees: g(j, “total_children”(j))
Note that we are including the subtree that doesn’t include j, this is why c has an extra +1. Since this subtree is empty, it doesn’t change d.
A way to find the a formula combined sum of sizes in this case is to first try an easy example in which the individual sizes are known. Imagine that one of the cases to combine has 3 subtrees of sizes 4, 5, and 7 so that a = 3 , b = 4+5+7 and the other case has 2 subtrees of sizes 9 and 11, so that d = 9 + 11. When combining them, we will have 6 different mixes with the following sizes:
4 + 9
4 + 11
5 + 9
5 + 11
7 + 9
7 + 11
We want to find the sum of all these values. Note that 4, 5 and 7 are added c= 2 times each and 9 and 11 are added a = 3 times each. So we have (4 + 5 + 7) * c + (9 + 11) *a = b*c + a*d which is a formula that doesn’t need to know the individual sizes. So we can do g(i,t) = a*d + b*c.
We are ready to code a solution, we can use iterative dynamic programming to calculate f() and g()for each individual subtree starting from the highest index to the lowest index (the nodes are top-sorted in the input).
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
const int MOD = 1000000007;
int sumOfSizes(int n, int a0, int b, int c, int m) {
vector < vector < int >> g(n);
vector < int > parent(n);
parent[0] = -1;
vector < int > a(n);
a[0] = a0;
for (int i = 1; i < n - 1; i++) {
a[i] = (int)((b * (long) a[i - 1] + c) % m);
}
for (int i = 1; i <= n - 1; i++) {
parent[i] = a[i - 1] % i;
g[parent[i]].push_back(i);
}
vector < vector < long >> sum_trees(n);
vector < vector < long >> count_trees(n);
long res = 0;
for (int i = n - 1; i >= 0; i--) { // For node i:
sum_trees[i].resize(g[i].size() + 1);
count_trees[i].resize(g[i].size() + 1);
for (int t = 0; t <= g[i].size(); t++) {
if (t == 0) {
count_trees[i][t] = 1;
sum_trees[i][t] = 1;
} else {
int j = g[i][t - 1];
long a = count_trees[i][t - 1];
long b = sum_trees[i][t - 1];
long c = 1 + count_trees[j][g[j].size()];
long d = sum_trees[j][g[j].size()];
count_trees[i][t] = (count_trees[i][t - 1] * (1 + count_trees[j][g[j].size()])) % MOD;
sum_trees[i][t] = ((a * d) % MOD + (b * c) % MOD) % MOD;
}
}
// add to the result, the result needs all subtrees, not just those
// rooted at 0
res += sum_trees[i][g[i].size()];
}
return (int)(res % MOD);
}
Let’s begin with an easier version of the problem, the setup is not a circle: In the division 2 version of this problem ,the stacks are in a line and the last stack is not adjacent with the first stack. The solution to that problem is rather simple and needs O(n) time.
The solution to this cyclic version comes upon realizing that the stones don’t need to move between more than n stacks, the moves in such a situation would be redundant. So there will be two adjacent stacks that don’t need to share stones at all. Those two stacks can be used as the start and the end stacks of a solution like the division 2 version. So we can just try all O(n) candidates for those stacks and return the case that minimizes the number of moves:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
const long INF = numeric_limits < long > ::max();
long div2_get(vector < int > a, vector < int > b)
// mostly the division 2 solution, but returns INF instead of -1 and uses 64 bits integers
{
int n = a.size();
long p = 0;
long cost = 0;
for (int i = 0; i < n; i++) {
long has, needs;
if (p >= 0) {
has = a[i] + p;
needs = b[i];
} else {
has = a[i];
needs = b[i] - p;
}
p = has - needs;
cost += abs(p);
}
return (p == 0) ? cost : INF;
}
long get(vector < int > a, vector < int > b) {
int n = a.size();
long res = INF;
for (int i = 0; i < n; i++) {
res = std::min(res, div2_get(a, b));
// rotate the two arrays:
a.push_back(a[0]);
a.erase(a.begin());
b.push_back(b[0]);
b.erase(b.begin());
}
return (res >= INF) ? -1 : res;
}
In problems that involve the GCD and/or LCM it sometimes helps to see each integer in terms of its prime factorization. For example, consider a single prime number p and two integers x and y. p appears in a integer’s prime factorization as p raised to an exponent r. ( r could be 0 when the integer is not a multiple of p). So for example : 12 = 2^2 * 3, so the exponent of prime 2 is 2.
If the exponents of prime p in x and y are r_x and r_y, respectively, then the exponent of prime p in the “GCD”(x,y) is: min(r_x, r_y).
Likewise, the exponent of p in “LCM”(x,y) is max(r_x, r_y).
The allowed operations consist of taking two integers x and y and replacing them with their GCD(x,y) and LCM(x,y). So let’s consider only the exponents r_x and r_y of a prime number p, then the operation consists of taking r_x and r_y and replacing them with min(r_x,r_y) and max(r_x,r_y). Consider the relationship to sorting: You compare two numbers r_x, r_y and reorder them.
Now that we know what happens to prime factors in a single operation, we need to tinker about how to find the optimal sequence of operations.
Note what happens to all prime factors when applying the operation to two integers. All prime factors will get the maximum exponent in one of the numbers and the minimum exponent in the other. The problem asks us to find the maximum possible sum of all the numbers after doing these operations, an idea is to get the maximum LCMs possible. . Let’s imagine there’s a single prime number. Remember that each operation sorts the pair of picked integers, repeating the operations for all possible pairs will take us to a sorted sequence (bubble sort) and this is the optimal solution. Not only that, but this will apply to all prime factors as well. So in the optimal solution, all exponents of the prime factors are sorted, so that for each p the , i-th number in the sequence will be a multiple of p ^ (r_i) where r_i is the i-th smallest exponent for the prime number among all the original numbers.
The new issue is how to quickly find and sort all those prime factor exponents and calculate each of the numbers based on the sorted prime exponents. By the constraints, there will be at most 100000 numbers, in 500 distinct arithmetic progressions. None of the numbers will exceed 10000000. The relevant prime numbers might be any prime number between 2 and 10000000. There are 664579 such prime numbers.
The first step is to factorize up to 100000 numbers smaller than 10000000. This can be done quite quickly with a modified Sieve of Eratosthenes. Instead of just saving if each integer is prime or composite, store the largest prime factor you’ve found so far for the integer. Then when the time comes to factorize a given integer x, we can grab a prime factor p using the Sieve’s results, repeatedly divide x / p to find the exponent then repeat with the new x’ which will have a new prime factor.
We can find the exponents for each prime factor for each integer in the input. Now we need to sort the exponents. Doing this for each of the 100000 numbers for 667579 prime numbers is too costly. The trick is to note that most integers are only multiples of a few prime factors. So most of the 667579 prime numbers will have mostly zeros for exponents and only a few of the integers will have non-zero exponents for that prime in their factorization. We can do the multiplication only for those integers that have a non-zero exponent for the given prime number. This can be accomplished with a heap data structure. When factorizing the number, store, for each prime number, the non-zero exponents we find in a heap. Then when the time comes to find the final values of each integer, use the heap to find the list of the greatest exponents and multiply. See code for more information:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
const int MOD = 1000000007;
int getMaximalSum(vector < int > start, vector < int > d, vector < int > cnt) {
const int SIEVE_MAX = 10000000;
vector < int > first_prime_factor(SIEVE_MAX + 1, -1);
for (int i = 2; i <= SIEVE_MAX; i++) {
if (first_prime_factor[i] == -1) {
first_prime_factor[i] = i;
for (int j = i + i; j <= SIEVE_MAX; j += i) {
first_prime_factor[j] = i;
}
}
}
// for each possible prime factor, a heap of exponents:
map < long, multiset < int, greater < int >> > primes_exponents;
vector < long > res;
for (int i = 0; i < start.size(); i++) {
int x = start[i];
for (int j = 0; j < cnt[i]; j++) {
res.push_back(1);
int y = x;
int p = first_prime_factor[y];
while (p != -1) {
int k = 0;
while (y % p == 0) {
k++;
y /= p;
}
// add k to p's heap of exponents
primes_exponents[p].insert(k);
// get next prime factor from sieve's table
p = first_prime_factor[y];
}
x += d[i];
}
}
// for each of the prime factors:
for (auto q: primes_exponents) {
long p = q.first;
// get the respective heap
auto & exps = q.second;
int k = res.size() - 1;
for (int x: exps) {
// multiply the prime factor to res[i]s according to the heap's
// contnets.
for (int i = 0; i < x; i++) {
res[k] = (res[k] * p) % MOD;
}
k--;
}
}
// get the sum:
long res_sum = 0;
for (long x: res) {
res_sum += x;
}
return (int)(res_sum % MOD);
}